Abstract: Expressed sequence tags (ESTs) are generated from
single-pass sequencing of randomly picked cDNA clones and can be used
for the development of simple sequence repeat (SSR) markers, or
microsatellites. However, EST databases have been developed for only a
small number of species. This paper provides a case study of the utility
of freely available birch EST resources for the development of markers
necessary for the genetic analysis of Betula luminifera. Based on birch
EST data, primers for 80 EST-SSR candidate loci were developed and
tested in birch. Of these, 59 EST-SSR loci yielded single, stable and
clear PCR products. We then tested the utility of those 59 markers in
B. luminifera. The results showed that 28 (47.5%) yielded stable and
clear PCR products for at least one B. luminifera genotype. In addition,
this study describes a rapid and inexpensive alternative for the
development of SSRs in species with scarce available sequence data.

Introduction

Molecular markers have broad uses in genetic research (e.g.
construction of genetic maps and gene mapping), breeding (e.g.
molecular marker-assisted selection), gene cloning, and comparative
genomics. Many types of molecular markers have been developed since
restriction fragment length polymorphisms (RFLP) were introduced in
1980 (Botstein et al. 1980). To date, the most powerful and widely
available markers are those based on PCR techniques. In short, there
are two types of PCR-based markers. The first type is random-primer
markers, which can be used in nearly all species; examples are randomly
amplified polymorphic DNAs (RAPDs) (Williams et al. 1990) and amplified
fragment length polymorphisms (AFLPs) (Vos et al. 1995). The second
type is specific-primer markers such as SSRs (simple sequence repeats),
which must be developed from and used in the target species (Becker et
al. 1995). Random-primer markers are more flexible because they can be
used in nearly all species, but their reliability, especially that of
RAPDs, is in doubt to some extent.

SSRs have proven to be more reliable than other markers.
SSRs consist of tandem repeats of short (1-6 bp) nucleotide
motifs (Gupta et al. 1996). These repeat sequences are distributed
throughout the genome. Polymorphism revealed by SSRs results from
variation in repeat number, which primarily results from slipped-strand
mispairing during DNA replication. Thus, SSRs reveal much higher levels
of polymorphism than most other marker systems (Toth et al. 2000; Li et
al. 2002). Although the utility of SSRs in genetic studies is well
established, the isolation and characterization of such markers by
traditional methods is costly and time-consuming, which makes the de
novo development of SSRs unrealistic for some taxa (Becker et al.
1995; Pashley et al. 2006).

Expressed sequence tags (ESTs) are generated from single-pass
sequencing of randomly picked cDNA clones (Adams et al. 1991). The EST
approach and subsequent gene-expression profiling (cDNA microarrays)
have proven to be efficient ways to identify genes and analyze their
expression during different developmental stages or under various
environmental stresses (Fowler et al. 2002; Wei et al. 2005). With the
recent progress made in large-scale plant functional genomics
sequencing projects, thousands of data sets have been generated and,
importantly, most of these are freely available for use by any plant
biologist worldwide (Brady et al. 2009). These ESTs are useful for
developing SSR markers (EST-SSRs). Thus, the use of such databases to
develop SSRs could be an inexpensive and rapid alternative to
traditional methods (Wang et al. 2005). At present, many species still
have no EST data, yet those species also need more efficient molecular
markers to support research into their biology. According to Yang et
al. (2007), EST-derived markers are likely to be conserved across a
broader taxonomic range than any other sort of marker. For species with
scarce sequence information, it may be feasible to develop EST-SSR
markers using EST data from closely related taxa (Lu et al. 2006). This
study explores that strategy.

Pashley et al. (2006) described a case study in which the publicly
available cultivated sunflower EST database was used to develop SSR
markers for the genetic analysis of two other sunflower species. The
results showed that EST-derived SSRs were more than three times as
transferable across species as genomic SSRs. Moreover, EST-SSRs whose
primers were located within protein-coding sequence were more readily
transferable than those derived from untranslated regions. Their survey
also revealed that more than one-third of all plant-derived EST
collections of sufficient size could conceivably serve as a source of
EST-SSRs for the analysis of rare, endangered, or invasive plant
species worldwide.

Tree biology research has so far lagged behind research in other
organisms. Databases of tree sequence data are limited, and for many
tree species no DNA sequence information is available. This lack of
data makes it difficult to develop locus-specific primers and leaves a
narrow choice of markers for biological research in those species. It
is therefore useful and necessary to develop and test more markers for
them.

Betula luminifera is a deciduous tree that occurs widely in the
temperate zone of China. As with many other tree species, its
biological and sequence data are limited. Birch (Betula platyphylla
Suk.), for which abundant sequence and biological information is
available in public databases, is closely related to B. luminifera.
This paper provides a case study of the utility of freely available
birch EST resources for the development of markers necessary for
genetic analyses of B. luminifera.

Materials and methods 

Search of putative SSR 

In total, 3 028 EST sequences of birch (B. platyphylla), released by
PlantGDB (http://www.plantgdb.org), were examined. We used SSR Finder
(http://www.gramene.org.gramene/searches/ssrtool) to search for SSRs
among these ESTs. ESTs containing a 2-4 bp repeat motif were selected
as putative SSRs.
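
The search step can be illustrated with a minimal Python sketch. It is not the SSR Finder implementation; it only shows how perfect 2-4 bp tandem repeats can be located with a regular expression. The minimum repeat counts in MIN_REPEATS are assumptions made for illustration, since the thresholds used in this study are not stated, and the function names are hypothetical.

```python
import re

# Assumed thresholds (not stated in the paper): minimum number of tandem
# copies required for motifs of length 2, 3 and 4 bp.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. AAAA or AGAG, already covered by shorter motifs
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    toy_est = "ACGTTGCA" + "AG" * 8 + "TTGACC" + "CTT" * 5 + "GATC"
    for start, motif, count in find_ssrs(toy_est):
        print(start, motif, count)
```

The toy sequence is invented; real input would be the EST sequences downloaded from PlantGDB.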

Targeting of candidate SSR by electronic PCR 

To develop candidate SSR markers from the SSRs identified with SSR
Finder, we designed PCR primers based on the flanking regions of the
EST sequences using ePrimer3 (http://www.hgmp.mrc.ac.uk). For
convenience, we used a 200-bp cDNA sequence, with 100 bp on each side
of the target SSR, for the primer design of each EST-SSR. We then
tested the designed primers by electronic PCR (e-PCR; Schuler 1997) on
the birch EST sequences. To increase the quality and usability of the
SSR markers developed in silico, we required exact matches between
primers and templates and set a 600-bp limit on the product size for
the e-PCR. We accepted a putative EST-SSR locus as a candidate EST-SSR
marker only if its primers successfully and uniquely amplified the
correct target in the e-PCR. The candidate primers were named with the
abbreviation BES (for B. luminifera EST-SSR) followed by a unique
number (e.g. BES18).
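
The acceptance rule described above (exact primer matches, a product of at most 600 bp, and a single unique product) can be sketched in Python as follows. This is an illustration of the filtering logic only, not the e-PCR program of Schuler (1997); the function names are hypothetical, and for simplicity the sketch searches only the given strand of each template.

```python
MAX_PRODUCT = 600  # product size limit used for the e-PCR step


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def epcr_products(forward, reverse, templates, max_product=MAX_PRODUCT):
    """List (template_id, start, length) for every exact-match amplicon."""
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)  # where the reverse primer binds
    products = []
    for name, seq in templates.items():
        seq = seq.upper()
        f_pos = seq.find(forward)
        while f_pos != -1:
            r_pos = seq.find(reverse_site, f_pos + len(forward))
            while r_pos != -1:
                length = r_pos + len(reverse_site) - f_pos
                if length > max_product:
                    break  # later reverse sites give even longer products
                products.append((name, f_pos, length))
                r_pos = seq.find(reverse_site, r_pos + 1)
            f_pos = seq.find(forward, f_pos + 1)
    return products


def is_candidate(forward, reverse, templates):
    """Accept a primer pair only if it yields exactly one in silico product."""
    return len(epcr_products(forward, reverse, templates)) == 1
```

A primer pair that returns more than one product, for example because two overlapping ESTs contain the same locus, would be rejected by is_candidate.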

Verification and evaluation of SSR markers in Betula luminifera 

One birch genotype and two B. luminifera varieties, Lin’an5 and
Sichuan4, which represent two different ecotypes, were used to verify
the candidate SSR markers. Total genomic DNA was isolated from 200 mg
of fresh leaf tissue using the CTAB method (Murray et al. 1980).

All primers used were synthesized by Nanjing Jinsite Biological
Engineering & Technology Company in China. PCR was performed in 20 μL
reactions containing 50 ng of template DNA, 0.5 μmol/L of each primer,
200 μmol/L of each dNTP, 1.5 mmol/L of MgCl2, 1 unit of Taq polymerase,
and 2 μL of 10× PCR reaction buffer. A touchdown PCR program (Don et
al. 1991) was used: 5 min at 95°C; 10 cycles of 30 s at 95°C, 30 s at
58°C minus 0.3°C/cycle, 1 min at 72°C; 20 cycles of 30 s at 95°C, 30 s
at 55°C, 1 min at 72°C; and 7 min at 72°C for a final extension. For
those primer pairs that did not generate good amplification results,
the initial annealing temperatures were adjusted from 55°C to 60°C.
Each primer pair was tested twice to confirm the repeatability of the
observed bands in each genotype. PCR products were separated on agarose
gels, which were stained with ethidium bromide to visualize the DNA
bands.
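
The annealing temperatures implied by the touchdown program can be listed with a short Python sketch. It assumes that the first touchdown cycle anneals at 58.0°C and that each following cycle is 0.3°C lower; the text does not say whether the decrement is applied before or after the first cycle, so this is one possible reading.

```python
def touchdown_annealing_schedule(start_temp=58.0, step=0.3, touchdown_cycles=10,
                                 final_temp=55.0, final_cycles=20):
    """Return the annealing temperature (degrees C) of every PCR cycle."""
    # Assumption: cycle 1 anneals at start_temp, cycle 2 at start_temp - step, ...
    touchdown = [round(start_temp - step * i, 1) for i in range(touchdown_cycles)]
    return touchdown + [final_temp] * final_cycles


if __name__ == "__main__":
    for cycle, temp in enumerate(touchdown_annealing_schedule(), start=1):
        print("cycle %2d: anneal at %.1f C" % (cycle, temp))
```

Under this reading the touchdown phase ends at 55.3°C, just above the 55°C used for the remaining 20 cycles.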

Sequencing PCR product 

To confirm that the PCR products amplified in B. luminifera were
homologous to the birch genes in which the loci were first identified,
a band yielded by primer BES17 in Lin’an was isolated, purified and
sequenced.

Results and discussion 

Candidate SSR markers 

By screening 3 028 EST sequences from birch with SSR Finder, we
identified 331 ESTs carrying SSR motifs; that is, about 10.9% of the
ESTs contain an SSR. We successfully obtained e-PCR products for 151
(45.6%) of those 331 putative SSR loci. The loci that did not yield a
product had flanking sequences that were too short. With these 151
primer pairs, we obtained 333 e-PCR products, which shows that some
primer pairs yielded the same PCR product from multiple BAC clones or
overlapping ESTs. Multiple-copy markers are not desirable for genetic
studies, so we discarded these primer pairs. Finally, 80 candidate SSR
markers were selected for further development.

Experimental tests of candidate SSR markers

Because the primers were designed from birch EST data, their
applicability in B. luminifera had to be tested experimentally. All
primers were first tested on birch DNA preparations (Fig. 1). Of the 80
candidate SSR markers tested, 68 (85%) yielded stable and clear PCR
products as expected in birch. We suspect that the main reason for
amplification failure of the remaining 12 candidate loci is that the
species of birch used in this study is different from that from which
the EST data were generated. Among the 68 candidate SSR markers, 59
(86.8%) yielded single, stable and clear PCR products as expected in
birch. We then tested the utility of those 59 markers in B. luminifera
(Fig. 2). The results showed that 28 (47.5%) yielded stable and clear
PCR products in at least one B. luminifera genotype. The primers for
these new B. luminifera SSR markers and their names are shown in
Table 1.
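
As a quick arithmetic check, the success rate of each screening step can be recomputed from the counts reported in the text. The short Python sketch below does only that; the step labels are paraphrases, and the helper name percent is hypothetical.

```python
def percent(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# (description, number retained, number tested), taken from the text
steps = [
    ("ESTs carrying an SSR motif", 331, 3028),
    ("putative loci with an e-PCR product", 151, 331),
    ("candidates amplifying in birch", 68, 80),
    ("of those, giving a single clear product", 59, 68),
    ("markers amplifying in B. luminifera", 28, 59),
]

for label, kept, tested in steps:
    print("%-42s %4d / %4d = %5.1f%%" % (label, kept, tested, percent(kept, tested)))
```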


Table 1. Twenty-eight EST-SSR markers developed for Betula luminifera

Primer name  ID in PlantGDB  Forward primer (5'-3')       Reverse primer (5'-3')
BES2         gi34389440      GCCGGGGAAGAAAGTTACC          CACGTTGGGAATGTGATGAT
BES3         gi34389519      CCAACAGGCTTTCATTTGCT         ATCAGGGGCATCAACAAGAG
BES7         gi34388540      TCTCTTCCCGCAAACTCTCT         ATAAACCGCCCAGGAAAAAC
BES10        gi34389631      GAATGTTCTCTGCTCCTCCAG        TCACTATTCGGTGCAACAGG
BES15        gi34389670      CCCCCTCCCTTTTACTCTTTC        TTCTGCTCCCGTCTCATCTT
BES17        gi34388669      TCTCACCAAACCACTCACTCA        AAGAGCGTGGCAATGAACTC
BES18        gi34389738      CAGACGACAAAGCAAGCTGA         CATGCTCACATACAAGGCAAA
BES19        gi34389809      GCGACACACCCTACCATCTT         GGTGCACTTGCAGATGTGAT
BES20        gi34388674      GGTTGCTCAACCTAACCAACA        AGAACACCCACCAAGTCACC
BES22        gi34389885      AGGGTGTTCAAACCGACGA          CGGTCTCAATCTCCACGTTT
BES23        gi34389943      GCACTCACTCGGATACTCGTC        CTTTTGCACCATGTTTGTGG
BES24        gi34389962      GCCGGGAGAATTACACGTC          CCCCTTTCTTCAGATCAACG
BES25        gi34388873      TAGAGCGTTGCGCAGATAGA         CAGGTTCCTCTCCTCCACTG
BES27        gi34390086      TTTATTTTTCAATTTTTCCTAGAGAGG  ACCACACCGAGGCATACAAT
BES29        gi34390130      GCGACAGGAAATTCAACCAC         CTGCGTCAGACTGCACATTT
BES31        gi34388686      TGAATAGACCGTTGCGCTTA         CGTATCTCTCGGCTTGCTCT
BES33        gi34390195      GAGAGAACCAAAACAGTAGACAGAGA   GGCCTGTTCTTGATGACGAT
BES35        gi34390304      GGGGGTTGCTCTTCATTTTT         GGTTTCCTCGTCGGTTATGA
BES36        gi34388710      CGCCAAATCTTTACCCAGAA         CGACGATGATGATCCATGAG
BES39        gi34390489      CGGGGGACATTACAAATAGC         TCGCATCTTCATCTGTGAGG
BES40        gi34390566      CCCTGCCTCTTCTCTGTCAC         GCCATAAGCCTCCAATCTCA
BES45        gi34390675      CAGCGTATGAAACCAGAACG         TAAAACGGACCCACTTGAGC
BES48        gi34390740      GTTAAGAAGGTGCGCCAGTC         ACTAACCGCGCATAAACTGC
BES60        gi34388842      GGACTTCTTCGGAGACATGG         CCCCAGAAAATAACGGCATA
BES62        gi34388871      CCTCTCTGGCTTCTCCTCCT         TCGAATCCATATCCACCAAAA
BES67        gi34389051      CGGGGGAACCATCAAAAA           CGGGAAGTCGCATATAGGAA
BES70        gi34389144      GGCGTTTAATCTGGGTGAGA         ACGCCAGAATGGTAGACACC
BES80        gi34389429      TCAGCTTCGTTCCAAAACCT         CCCATTTGGAGATGGAGAAA


Fig. 1 Some of the primers screened in birch, separated by
electrophoresis on a 1% agarose gel (M: DNA Marker DL2000)

Fig. 2 Some of the primers screened in B. luminifera, separated by
electrophoresis on a 1% agarose gel (M: DNA Marker DL2000). Each
two neighboring lanes (from 1 to 20) were amplified with the same
primer pair. Odd lanes are PCR products from the Lin’an5 DNA template;
even lanes are PCR products from the Xichuan4 DNA template


Homology test

To examine the homology of SSR markers amplified in B. luminifera
using primers derived from the EST sequences of birch, we randomly
selected one PCR product, yielded by primer BES17 in the Lin’an
variety, for DNA sequencing. BLAST results for the 177-bp fragment
(Fig. 3) showed that the sequence shared 88% identity with the EST
(gi|34388669) from which the SSR primer (BES17) was designed, which met
our expectations (Fig. 4).

CCCTTCCTCTAGTTAGCGCGGGTGCCATGGCTAAAOACATGOAAAGTTOGAOGGCAACGT
GGTTTCTOCCCCAAGGACTACCACAACCCACCACCGGCGCCGCTGATTGGTGCACACGAG
CTCAGAATGTGGTCCTTCTAGCGGTCTATCATTGCCTACCTCATTGCTACGCTCATA

Fig. 3 Nucleotide sequence of the PCR product yielded by primer BES17

Fig. 4 Nucleotide alignment between the PCR product of BES17 (query,
positions 7-174) and EST gi|34388669 (subject, positions 64-230)
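
For readers unfamiliar with the identity figure reported by BLAST, the Python sketch below shows how a percent identity can be computed from two aligned sequences: the number of identical columns divided by the alignment length, gaps included. It is an illustration under that assumed definition, not the BLAST program itself, and the two short sequences are invented rather than taken from Fig. 4.

```python
def percent_identity(aligned_query, aligned_subject, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
        if q == s and q != gap
    )
    return 100.0 * matches / len(aligned_query)


if __name__ == "__main__":
    # Invented toy alignment: one mismatch and one gap in 11 columns.
    print(round(percent_identity("ACGTTGCA-GT", "ACGATGCATGT"), 1))
```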

Good molecular markers are useful tools for genetic and biological
studies. The study of plant biology, as with all areas of biology, has
undergone dramatic changes in the past decade since the development of
high-throughput methods for sequence determination. In recent years,
high-density oligonucleotide re-sequencing microarrays and
next-generation sequencing technologies have resulted in a considerable
increase in the amount of available genome sequence data
(http://www.ncbi.nlm.nih.gov). However, many species, especially woody
plants, still have little or no publicly available sequence
information, and at present it is not feasible to sequence the genomes
of all species. As we verified in this paper, EST data can work very
well for developing genetic tools for taxa closely related to the
species from which the ESTs were sequenced. We also verified our
primers in a large population within the genus Betula, and they worked
well (detailed results are the subject of another article). These
results suggest that similar strategies can be followed to develop SSR
primers in a variety of species with scarce sequence data. Developing
EST-SSR markers could be a good choice for those species.

In addition, because EST-SSRs are genetic markers residing in gene
sequences, they can directly reflect aspects of variation within those
genes. Therefore, maps constructed with EST-SSR markers could be
especially valuable for genetic studies. On the other hand, levels of
polymorphism could be low because these SSRs are in expressed regions,
which may be under more evolutionary constraints than SSRs derived from
untranscribed sequences. Nevertheless, EST-SSR markers can be useful
for comparative genomic studies precisely because they are designed in
expressed regions. They could therefore be used widely, especially for
the analysis of collinearity among different genomes within the same
genus (Rong et al. 2004; Lu et al. 2006).